Rank | Count | Beginning |
---|---|---|
18 | 200 | 이 |
8 | 130 | 또한 |
35 | 120 | 그러나 |
99 | 85 | 하지만 |
12 | 82 | 그리고 |
458 | 81 | 그 |
199 | 77 | 또 |
21 | 71 | 이러한 |
16 | 67 | 특히 |
13 | 59 | " |
45 | 59 | 따라서 |
118 | 56 | 1. |
240 | 50 | 2. |
876 | 47 | 3. |
227 | 42 | 본 |
137 | 41 | 이에 |
128 | 39 | 제 |
1055 | 38 | 그래서 |
60 | 37 | 4. |
436 | 37 | ② |
53 | 36 | 이번 |
265 | 35 | 그런데 |
344 | 35 | 현재 |
504 | 35 | 물론 |
777 | 34 | 이는 |
451 | 33 | 우리 |
1044 | 33 | 이런 |
109 | 32 | ” |
116 | 31 | 지난 |
232 | 30 | ③ |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
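The counting step behind the query above can be sketched outside the database as well. The following Python snippet is only an illustration, not the code used to produce the table: the function name `sentence_beginnings`, its parameters, and the three toy sentences in `sample` are assumptions made for this example. It treats a sentence beginning as everything before the N-th space, which is how `substring_index(sentence, ' ', N)` behaves, so the same function covers N = 1, 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper: words are separated by single spaces, mirroring
    substring_index(sentence, ' ', n) in the SQL query.
    """
    counts = Counter()
    for sentence in sentences:
        # Keep the text before the n-th space; a sentence with fewer
        # than n spaces is kept whole, as substring_index would do.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like ORDER BY cnt DESC LIMIT 50.
    return counts.most_common(limit)


# Invented toy sentences, used only to exercise the function.
sample = [
    "이 책은 새롭다.",
    "이 결과는 중요하다.",
    "또한 결과가 있다.",
]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Calling `sentence_beginnings(sample, n=2)` would give the two-word beginnings in the same way. Splitting on a single space is a deliberate simplification to stay close to the query; a real corpus may need proper tokenization.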
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV